The Fibonacci sequence part 1: From O(2^n) to O(n)

The classic dynamic programming question involves finding the nth number of the Fibonacci sequence. In this article, we’ll look into 2 algorithms, ranging from O(2^n) to O(n). In the next article, we’ll look at how to go from O(n) to the blazingly fast O(log n).

What is the Fibonacci sequence?

The Fibonacci sequence is a sequence of numbers following two simple rules:

  1. The first two numbers are both 1
  2. Every subsequent number is the sum of the previous two numbers.

So, the sequence looks like:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …

Our goal is to create a function fib(n) that takes a number n as input and outputs the nth number of the Fibonacci sequence.

The basic recursive solution

Since we have defined the sequence recursively (each term is defined as a function of the previous terms), we can intuitively encode those two rules into code.

def fib(n):
  if n == 1 or n == 2: return 1 # The first two numbers are both 1
  return fib(n - 1) + fib(n - 2) # Each subsequent number is the sum of the previous two numbers
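As a quick sanity check (restating the function here so the snippet is self-contained), the first ten outputs match the sequence listed above:

```python
def fib(n):
    # Naive recursive Fibonacci: the base case covers the first two terms
    if n == 1 or n == 2:
        return 1
    return fib(n - 1) + fib(n - 2)

print([fib(i) for i in range(1, 11)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```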

And that’s all you need. Though, if you try to run this, you’ll find that as n increases, the algorithm very quickly slows down.

For n = 40, it took about 9.5 seconds to calculate:

CPU times: user 9.48 s, sys: 5.51 ms, total: 9.48 s
Wall time: 9.56 s

For n = 50, I gave up waiting before the algorithm produced a value. But why is it so slow?

Time complexity of the traditional recursive Fibonacci algorithm

Time complexity is a programmer’s way of measuring how slow an algorithm becomes as the input size increases. In this case, our input is just a single number n.

If n is either 1 or 2, our time complexity is just constant, since the algorithm only has to run the if statement and return 1. So, let’s consider any n greater than 2. For any such n, each call of the fib function runs two more calls of the same fib function. So if we have n = 3, we’ll run two extra function calls. If n = 4, we’ll run two function calls, each of which may run two further calls. From this, you may intuitively feel that the number of function calls grows as a power of 2.

For n = 3, we have 2 function calls

For n = 4, we have roughly 2*2 function calls (one for n = 3 and one for n = 2; the call with n = 3 will in turn run two more calls, for n = 2 and n = 1).

If this is getting confusing, we can visualize this through a chart.

Suppose that n = 7. Then the algorithm will run two calls, with n = 6 and n = 5. The call with n = 6 will then run two more calls, for n = 5 and n = 4, and this goes on until we get to the calls with n = 1 and n = 2, which return 1.
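We can verify this explosion of calls by instrumenting the naive function with a simple counter (the counter is just illustration, not part of the algorithm):

```python
call_count = 0

def fib(n):
    # Naive recursive Fibonacci, instrumented to count every call made
    global call_count
    call_count += 1
    if n == 1 or n == 2:
        return 1
    return fib(n - 1) + fib(n - 2)

result = fib(7)
print(result, call_count)  # 13 25 — 25 calls just to compute fib(7) = 13
```

Each increase of n roughly doubles this count, which is exactly the exponential growth described above.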

This is a tree structure. The node at the top is called the root of the tree, and the nodes at the ends are called leaves. The recursion ends at the leaves, i.e. no more functions are called there. The calls at the leaves will simply return 1, because of the base case we put in the definition: if n == 1 or n == 2: return 1.

Once this occurs, the function right above the leaf nodes will add their results, because we’ve defined it as fib(n - 1) + fib(n - 2). So, the call with n = 3 will return 1 + 1 = 2 as its return value.

Then we go up one level in the tree. The same thing occurs. The call with n = 2 will simply return 1, and we’ve just calculated the return value for the call with n = 3. Finally, the call with n = 4 will add both of them and will get 1 + 2 = 3.

This continues till we get to the root, where we perform the final calculation: 8 + 5 = 13.

What does a time complexity of O(2^n) mean?

At this point, it is easy to explain why the time complexity of the algorithm is O(2^n). This time complexity means that as our input n increases, the time taken by our algorithm grows exponentially with base 2. So with n = 10, we need about 2¹⁰, or 1024, steps. With n = 20, we require 2²⁰, or 1,048,576, steps. With n = 50, we get 2⁵⁰, a number with 15 zeros. Even if each step takes 1 nanosecond (a billionth of a second), the total time our algorithm will take is about 312 hours.
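That back-of-the-envelope figure can be checked in a couple of lines (the 1 nanosecond per step is, of course, an assumption made for illustration):

```python
steps = 2 ** 50          # roughly the number of steps for n = 50
seconds = steps * 1e-9   # assuming each step takes 1 nanosecond
hours = seconds / 3600
print(int(hours), "hours")  # 312 hours
```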

Why is the time complexity of this algorithm O(2^n)?

As you can see, at each node in the tree, the tree splits into two parts, which again split into two further subparts, and so on. Each branch of the tree is called a subtree. So with each increase in n, our tree roughly doubles in size.

How can we improve?

Look at the tree once again. Notice how a lot of the subtrees are repeated. The subtree rooted at the node with 5 actually appears twice in our tree. This means that our algorithm goes through that subtree twice. That’s a waste of time, isn’t it? An even bigger waste is that some subtrees are repeated more often still.

The subtree rooted at the node with the value 4 is actually repeated three times. So, we’re going through that subtree 3 times.

Now imagine the algorithm with the initial n = 100. Some parts of the computation will be done twice, three times, four times, and so on. How much of a waste is that!? What we need is a way to store results we’ve already computed, so rather than going through the tree again and again, we can first look into our notebook and use the value already stored there. This is called memoization.

Fibonacci algorithm with memoization

Memoization is the process of creating a memo. A memo is just a place where we store things we want to remember later.

We define this memo as a dictionary in Python. The keys will be the various values of n, and the values of the dictionary will be the outputs of our fib function.

We already know the first two values of the Fibonacci sequence, so we can simply add them to our memo. Then, at each step, our base case will be to check whether n is a key of our memo; if it is, we just return the value stored in our memo. Otherwise, we calculate the value as before, store it in our memo, and return it:

memo = {1: 1, 2: 1}
def fib(n):
  if n in memo: return memo[n] # Quick-return any value we've already computed

  memo[n] = fib(n - 1) + fib(n - 2)
  return memo[n]
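As an aside not covered in this article, Python’s standard library can handle this bookkeeping for us: functools.lru_cache wraps a function with exactly this kind of memo. A minimal sketch (the name fib_cached is mine, chosen to avoid clashing with the version above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, like our memo dictionary
def fib_cached(n):
    if n == 1 or n == 2:
        return 1
    return fib_cached(n - 1) + fib_cached(n - 2)

print(fib_cached(50))  # 12586269025
```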

This is so much quicker than what we had before. Running fib(50), we get our solution in just 4 microseconds. That is four millionths of a second:

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 5.96 µs
12586269025

This is fast to the point that we can run fib(1000) with no problem:

CPU times: user 254 µs, sys: 83 µs, total: 337 µs
Wall time: 339 µs
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

Just 337 microseconds.

Why is memoization faster?

As I said before, the entire idea of memoization is to store calculations so we don’t have to do them again. If we look at our tree, once we’ve gone through a subtree, we’ll store the result of that calculation in our memo, so the next time we go through that subtree again, we’ll quick-return the result from our memo rather than going through the tree.

In effect, we won’t even touch all the shaded-out subtrees, since we’ve already calculated the result of traversing them. In the end, all we have is a single chain going from 7 down to 2 and 1. Of course, at each step we need to look up the memo to see if we have a solution, so that takes some time, but it is constant time and won’t change as our input size changes, so we can ignore it. Finally, since we have a single chain going from our input n down to 1, we say that this algorithm has a time complexity of O(n). This means that the time our algorithm takes will grow proportionally as our input increases, not exponentially.
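The same O(n) chain can also be walked bottom-up with a plain loop, avoiding recursion entirely. This iterative variant is a sketch of an alternative the article doesn’t cover (the name fib_iter is mine):

```python
def fib_iter(n):
    # Bottom-up Fibonacci: O(n) time, O(1) extra space
    a, b = 1, 1  # fib(1) and fib(2)
    for _ in range(n - 1):
        a, b = b, a + b  # slide the pair one step up the chain
    return a

print(fib_iter(50))  # 12586269025
```

Because it keeps only the last two values instead of a whole memo, it also uses constant memory.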

Conclusion

In this article, we explored two algorithms for finding the nth Fibonacci number, starting with a basic recursive approach with exponential time complexity, O(2^n), and improving it using memoization to achieve linear time complexity, O(n). Memoization dramatically reduced computation time by storing previously computed results, transforming the algorithm from impractically slow to efficiently fast. In the next article, we will further optimize our Fibonacci calculation with an even faster algorithm, achieving O(log n) time complexity. Stay tuned for this advanced technique.